Predicting events is a challenge in many different disciplines, from meteorology to finance;
the more difficult this task is, the more complex the system is. Nevertheless, even with this
restricted definition, a general consensus on the correct indicator of complexity has
still not been reached. In particular, this characterization is still lacking for systems whose
time evolution is influenced by factors which are not under control and appear as random
parameters or random noise. We show in this paper how to find the correct indicators of
complexity in the information theory context. The crucial point is that the answer is twofold,
depending on whether the random parameters are measurable or not. This apparently trivial
observation has often been ignored in the literature, leading to paradoxical results.
Predictability is obviously larger when the random parameters are measurable; nevertheless,
in the contrary case, predictability improves when the unknown random parameters are time
correlated.

In a number of systems the dynamics is influenced by uncontrolled parameters which are
intrinsically random or cannot be predicted with the necessary precision. The evolution of a
system of this type is described in the framework of random dynamical systems, a term which
in the present paper also covers dynamical systems with noise.

A dynamical system can be studied by means of the associated symbolic dynamics, which, in
this case, corresponds to a stochastic process with random conditional probabilities, i.e.
probabilities which depend on the same stochastic parameters.

Obviously, the possibility of forecasting the future evolution strongly depends on the
possibility of measuring the parameters. The same model will have a different complexity
(predictability) according to whether the measurement is feasible or not. Even if this
observation appears trivial, it is often ignored in the literature. For example, the most
frequently used definition of complexity considers the separation of nearby trajectories
[...] under the same realization of the noise. This definition implicitly assumes that the
realization is known, and it should not be used when the contrary happens, as is often the
case. For example, the phenomenon of noise-induced order [...] should not be considered a
reduction of complexity when the random disturbance is not measurable.

A better characterization of complexity for dynamical systems with unmeasurable randomness
has recently been found for many physically relevant cases [...].

In this paper we show how to find proper indicators of complexity in the two cases of
measurable (accessible information) and non-measurable (inaccessible information)
randomness. We also show, with an example, that when the stochastic parameters have memory,
part of the inaccessible information is encoded in the dynamics of the system and can be
recovered. In other words, the gap between the two indicators of complexity shrinks when
the random parameters are time correlated.

Let us start with some basic definitions for the non-random case, also in order to
establish the notation. Let us assume that the state of the system is identified by the
vector $y(t)$, which evolves as a deterministic dynamical system according to
$y(t+1) = f(y(t))$. The corresponding phase space can be partitioned into regions indexed
by a symbol $x$. The associated symbolic dynamics $\ldots, x(1), x(2), \ldots, x(t), \ldots$
is the realization of a stochastic process with memory, i.e. the probability that the
system is in $x(t+1)$ depends on its past history $\ldots, x(1), x(2), \ldots, x(t)$.
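
As a concrete illustration of how a partition turns a trajectory into a symbol sequence, the sketch below iterates a deterministic map and records the region visited at each step. The logistic map, the two-interval partition at 1/2 and all function names are assumptions chosen here for illustration; they are not taken from the paper.

```python
def logistic_step(y):
    """One step of the logistic map y -> 4 y (1 - y) (illustrative choice of f)."""
    return 4.0 * y * (1.0 - y)


def symbolic_sequence(y0, steps, f=logistic_step, threshold=0.5):
    """Iterate y(t+1) = f(y(t)) and record x(t) = 0 if y(t) < threshold, else 1."""
    y = y0
    symbols = []
    for _ in range(steps):
        y = f(y)
        symbols.append(0 if y < threshold else 1)
    return symbols


if __name__ == "__main__":
    print(symbolic_sequence(0.3, 20))
```

Any other map or partition can be passed through the `f` and `threshold` arguments; only the resulting symbol sequence enters the entropies discussed in the text.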

The best characterization of the predictability of a process with memory can be found in
the information theory context, and it is the Shannon entropy S. Assume that
$p(\mathbf{x}_n)$ is the probability that the sequence of $n$ symbols
$x(t+1), \ldots, x(t+n)$ equals $\mathbf{x}_n = (x_1, \ldots, x_n)$; in this case



$$H_n = -\sum_{\{\mathbf{x}_n\}} p(\mathbf{x}_n)\,\log p(\mathbf{x}_n) \qquad (1)$$



is the entropy of the sequence. Then, the entropy rate $h_n = H_{n+1} - H_n$
($n \ge 0$, $H_0 = 0$) measures the average information contained in $n$ steps of the
process. In fact, the probability that the system is in $x_{n+1}$ if it was in
$x_1, \ldots, x_n$ is $p(x_{n+1}|\mathbf{x}_n) = p(\mathbf{x}_{n+1})/p(\mathbf{x}_n)$ and
one has



$$h_n = \sum_{\{\mathbf{x}_n\}} p(\mathbf{x}_n)\, e_n(\mathbf{x}_n) \qquad (2)$$



where 



$$e_n(\mathbf{x}_n) = -\sum_{\{x_{n+1}\}} p(x_{n+1}|\mathbf{x}_n)\,\log p(x_{n+1}|\mathbf{x}_n) . \qquad (3)$$

Equation (3) measures the information we have on $x_{n+1}$ if we know $x_1, \ldots, x_n$,
and (2) is the average with respect to all possible sequences $x_1, \ldots, x_n$. If
$h_n = 0$ (the minimum possible) one has the maximum of information and the next step can
be predicted with certainty; on the contrary, when $h_n$ attains its maximum, no
information is available.

Since the more of the past is known, the more information one has, the rate $h_n$
decreases when $n$ increases, and the Shannon entropy $h = \lim_{n\to\infty} h_n$ measures
the information we can extract from all the past history. It should be noticed that for a
process which is $l$-step Markovian, $h_n = h$ for all $n \ge l$. In this case, knowledge
of more than the last $l$ steps does not help to predict the future.

The Shannon entropy of the symbolic sequence associated with a dynamical system is known
as the Kolmogorov $\epsilon$-entropy [...]. Finally, by taking the supremum with respect to
all possible partitions, one obtains the Kolmogorov-Sinai entropy, which, in turn, equals
the sum of the positive Lyapunov exponents.
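
For a one-dimensional map the only Lyapunov exponent is the orbit average of $\log|f'(y)|$. The sketch below estimates it numerically, assuming the logistic map $f(y) = 4y(1-y)$ as a stand-in example (it is not a system studied in this paper); for this map the exponent is known to be $\log 2$. The function name and default arguments are hypothetical.

```python
import math


def lyapunov_logistic(y0=0.3, steps=100_000, burn_in=1_000):
    """Average of log|f'(y)| along one orbit of f(y) = 4 y (1 - y)."""
    y = y0
    for _ in range(burn_in):            # discard the initial transient
        y = 4.0 * y * (1.0 - y)
    total, count = 0.0, 0
    for _ in range(steps):
        slope = abs(4.0 * (1.0 - 2.0 * y))   # |f'(y)|
        if slope > 0.0:                      # skip the single point y = 1/2
            total += math.log(slope)
            count += 1
        y = 4.0 * y * (1.0 - y)
    return total / count


if __name__ == "__main__":
    print(lyapunov_logistic(), math.log(2))
```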

All the above definitions deal with probabilities and with the idea that one has to
consider many realizations of the process. In practice, what one has to do is simply to
consider a single very long realization of the process (much longer than $M^n$, where $M$
is the number of symbols) and to look at the frequency of each of the sequences of
length $n$.
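
The frequency-counting recipe can be sketched as follows: a minimal plug-in estimator of $H_n$ and of the rate $h_n = H_{n+1} - H_n$ from one long symbol sequence. The function names are hypothetical, and natural logarithms are assumed.

```python
import math
from collections import Counter


def block_entropy(seq, n):
    """Estimate H_n from the frequencies of the length-n words in one realization."""
    if n == 0:
        return 0.0
    words = Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    total = sum(words.values())
    return -sum(c / total * math.log(c / total) for c in words.values())


def entropy_rate(seq, n):
    """Estimate h_n = H_{n+1} - H_n, with H_0 = 0."""
    return block_entropy(seq, n + 1) - block_entropy(seq, n)
```

The estimate is reliable only while the number of observed words is much larger than the number of possible words of length $n + 1$, which is the reason for the length requirement stated above.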

The situation is less straightforward when one deals 
with a random dynamical system 



$$y(t+1) = f(y(t), \omega(t+1)) \qquad (4)$$



where $\omega(t+1)$ is a random variable which can be additive noise or a random
parameter. In this case, the associated symbolic dynamics has random conditional
probabilities. In fact, $x(t+1)$ depends not only on the past history
$\ldots, x(1), x(2), \ldots, x(t)$, but also on the present value $\omega(t+1)$ and on the
past history $\ldots, \omega(1), \omega(2), \ldots, \omega(t)$ of a second stochastic
process, which accounts for the uncontrolled random factors. We call the first the
subordinated process and the second the fundamental process. The reason is that, for the
problem we have in mind, the fundamental process is autonomous, which means that it has
memory but $\omega(t+1)$ depends only on its own past history
$\ldots, \omega(1), \omega(2), \ldots, \omega(t)$. Nevertheless, all the considerations
which follow also hold for the most general case, in which the probability for
$\omega(t+1)$ also depends on the past history of the subordinated process
$\ldots, x(1), x(2), \ldots, x(t-1)$. In this more general case the distinction between
fundamental and subordinated process is lost from a mathematical point of view, although
it remains very significant from the point of view of information, as we will see.

The problem is again to quantify the predictability of the subordinated process, but we
are now ready to understand that two different kinds of situation may arise. In the first,
the realization of the fundamental process is known (measurable or accessible
information); in the second, it is unknown (non-measurable or inaccessible information).
Correspondingly, a different degree of predictability of the subordinated process is
expected. More precisely, there is more information and predictability in the first case
(smaller entropy rate) and less in the second (larger entropy rate).

Before entering into the problem, let us consider a much simpler atemporal analogue which
clarifies the information context. Consider a (subordinated) random variable $x$ whose
probability $p_\omega(x)$ depends on the actual value of a second (fundamental) random
variable $\omega$. One can think of $p_\omega(x)$ as the conditional probability for $x$
given $\omega$.

The point we would like to focus on can be better understood through the following
example. Let us consider a coin toss in which the outcome is $x$, but two different coins
may be used. The two coins are identified by an index $\omega$ and they have different
probabilities for $x$. Given this simple situation, two different games may be played; in
both of them one has to guess the outcome of the toss, but the rule is different. In the
first game the player chooses the coin at random, looks at it, makes the guess and tosses;
in the second he makes the guess and only afterwards chooses the coin at random and tosses.

In the first game, at the moment of the guess, $\omega$ is known and the information is
measured by the entropy $-\sum_{\{x\}} p_\omega(x) \log p_\omega(x)$. Furthermore, since
the coin has been chosen at random, the entropy of the game before the start is the
average



$$H = -\sum_{\{x\}} \langle p_\omega(x)\,\log p_\omega(x) \rangle . \qquad (5)$$



One can easily check that $H = H_{x,\omega} - H_\omega$, where $H_{x,\omega}$ is the
entropy of the pair of random variables and $H_\omega$ is the entropy of $\omega$.
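
The check takes one line. Here $\pi(\omega)$ denotes the probability of $\omega$ (a symbol introduced only for this derivation), so that the average is $\langle \cdot \rangle = \sum_\omega \pi(\omega)(\cdot)$, the joint probability of the pair is $\pi(\omega)\,p_\omega(x)$, and $H$ is the averaged entropy of the first game:

```latex
\begin{aligned}
H_{x,\omega}
  &= -\sum_{\omega}\sum_{x} \pi(\omega)\,p_\omega(x)\,
      \log\bigl[\pi(\omega)\,p_\omega(x)\bigr] \\
  &= -\sum_{\omega} \pi(\omega)\log\pi(\omega)\sum_{x} p_\omega(x)
     \;-\; \sum_{\omega}\pi(\omega)\sum_{x} p_\omega(x)\,\log p_\omega(x) \\
  &= H_\omega + H ,
\end{aligned}
```

where the last step uses $\sum_x p_\omega(x) = 1$.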

In the second game, $\omega$ is unknown at the moment of the guess, and the probability
to be used is the average $\langle p_\omega(x) \rangle$. Therefore, the information
content is measured by



$$\tilde{H} = -\sum_{\{x\}} \langle p_\omega(x) \rangle \,\log \langle p_\omega(x) \rangle . \qquad (6)$$



The inequalities $H \le \tilde{H} \le H_{x,\omega}$ hold, where $H$ is the entropy (5) of
the first game and $\tilde{H}$ the entropy (6) of the second. The first inequality
trivially means that in the second game one has less information than in the first. The
equality holds only in the case of independence between the two random variables. In
turn, the second inequality becomes an equality only in the case of complete dependence of
the two variables, i.e. when to a given value $x$ there corresponds deterministically a
single value $\omega$.
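
The two games can be compared numerically. The sketch below, with hypothetical function names and natural logarithms, computes the entropy of the first game (coin seen before the guess), the entropy of the second game (coin not seen), the joint entropy of the pair and the entropy of the coin choice.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of a probability vector, with 0 log 0 = 0."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def coin_game_entropies(coin_weights, heads_probs):
    """Return (H_known, H_unknown, H_joint, H_omega) for the coin example.

    coin_weights[k] is the probability of picking coin k,
    heads_probs[k] is the probability that coin k gives heads.
    """
    # First game: the coin is seen before guessing -> average of per-coin entropies.
    h_known = sum(w * entropy([q, 1 - q]) for w, q in zip(coin_weights, heads_probs))
    # Second game: the coin is not seen -> entropy of the averaged distribution.
    q_avg = sum(w * q for w, q in zip(coin_weights, heads_probs))
    h_unknown = entropy([q_avg, 1 - q_avg])
    # Joint entropy of the pair (outcome, coin) and entropy of the coin alone.
    joint = [w * r for w, q in zip(coin_weights, heads_probs) for r in (q, 1 - q)]
    return h_known, h_unknown, entropy(joint), entropy(coin_weights)


if __name__ == "__main__":
    print(coin_game_entropies([0.5, 0.5], [0.9, 0.2]))
```

Two identical coins reproduce the independent case, in which the first two entropies coincide; coins with heads probabilities 1 and 0 reproduce the completely dependent case, in which the second entropy equals the joint one.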

The above discussion, although very simple, allows for the treatment of the intriguing
case of stochastic processes with random probabilities and, therefore, also of dynamical
systems with noise.

In the first scenario, once probabilistically generated, the sequence
$\ldots, \omega(1), \ldots, \omega(t), \ldots$ can be measured and treated as a known
ordinary time-dependent function. Consequently, the average entropy of a sequence of
length $n$ is given by



$$H_n = -\sum_{\{x_1,\ldots,x_n\}} \langle p_\omega(\mathbf{x}_n)\,\log p_\omega(\mathbf{x}_n) \rangle \qquad (7)$$



since $p_\omega(\mathbf{x}_n)$ is the probability of the sequence $x_1, \ldots, x_n$
given the realization of the fundamental process.

The entropy rate $h_n = H_{n+1} - H_n$ now represents the average information one has on
$x(t+1)$ if one knows its previous $n$ steps $x(1), \ldots, x(t)$ and also knows the whole
fundamental process, from the most recent $\omega(t+1)$ to the most remote past.

In this case, in analogy with the atemporal example, one can easily show that the Shannon
entropy $h = \lim_{n\to\infty} h_n$ is the difference between the Shannon entropy of the
couple process and the Shannon entropy of the fundamental process, i.e.
$h = h_{x,\omega} - h_\omega$. Therefore, from a practical point of view, what one has to
do is to generate or consider a very long sequence of both processes and then measure
the frequency of the sequences $(x_1; \omega_1), \ldots, (x_n; \omega_n)$ and, separately,
of the sequences $\omega_1, \ldots, \omega_n$.

Furthermore, $h$ is nothing other than the characterization of complexity which refers to
the separation of two nearby trajectories corresponding to the same realization of the
fundamental process (noise). Notice, in fact, that one also trivially has
$h = \lim_{n\to\infty} H_n/n$. Since $H_n/n$ is the average of a quantity which is
non-random in the limit, one can replace the average by the typical value, i.e. the value
corresponding to a single realization of the fundamental process. This realization only
plays the role of an ordinary function of time. The Pesin relation ensures, in this case,
that $h$ equals the sum of the positive Lyapunov exponents corresponding to the separation
of trajectories under the same realization of the fundamental process. This quantity is
often taken as a measure of complexity for chaotic noisy systems. Nevertheless, let us
stress once again that it is a measure of predictability only if the noise realization
itself can be measured with infinite precision.

Let us consider the more realistic case in which, on the contrary, $\omega(t)$ cannot be
measured. In a way which is completely analogous to (6), one finds that the entropy
associated with a sequence of length $n$ is



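$$\tilde{H}_n = -\sum_{\{x_1,\ldots,x_n\}} \langle p_\omega(\mathbf{x}_n) \rangle \,\log \langle p_\omega(\mathbf{x}_n) \rangle \qquad (8)$$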



and the entropy rate $\tilde{h}_n = \tilde{H}_{n+1} - \tilde{H}_n$ now describes the
information content of $n$ steps of the subordinated process when the fundamental process
is unknown. The Shannon entropy $\tilde{h} = \lim_{n\to\infty} \tilde{h}_n$ is the maximum
information which is available from the past.

In practice, in order to obtain $\tilde{h}_n$ (the entropy rate for an unknown fundamental
process), one has to generate the long sequence $\ldots, x(1), x(2), \ldots, x(t), \ldots$
and measure the frequency of the sequences without regard to the fundamental process.

We stress that this characterization of predictability may differ a lot from the first
one. Here $h$ denotes the entropy for a known fundamental process and $\tilde{h}$ the
entropy for an unknown one. For example, even non-chaotic dynamical systems with negative
or vanishing Lyapunov exponents ($h = 0$) may have a positive Shannon entropy $\tilde{h}$
and be largely unpredictable. More generally, the inequality
$h \le \tilde{h} \le h_{x,\omega}$ holds. The qualitative understanding of this inequality
is straightforward, since $h$ refers to the case in which the additional information on the
fundamental process is available. On the other hand, it is easy to check that $\tilde{h}$
is smaller than or equal to the entropy of the couple process. Since
$h = h_{x,\omega} - h_\omega$, one can also write
$h_{x,\omega} - h_\omega \le \tilde{h} \le h_{x,\omega}$. The first inequality becomes an
equality only when the two processes are mutually independent, and the second one only
when they are deterministically linked.

Unfortunately, while the entropy $h$ for a known fundamental process can be computed by
means of the Lyapunov exponents of a typical trajectory, there is no corresponding general
recipe for the entropy $\tilde{h}$ for an unknown one. Nevertheless, a very sensitive
approximate approach, based on the rate of separation of two nearby trajectories which
correspond to two different realizations of the fundamental process, has recently been
proposed [...].

Let us now give a simple example which may help to understand an important point: a
subordinated process which is memoryless when the realization of the fundamental process
is given may have a long memory when, on the contrary, the fundamental process is
unknown. This is because part of the information carried by the fundamental process is
encoded in the history of the subordinated one. The long memory of the subordinated
process may be induced even if the fundamental process itself is Markovian. The more the
fundamental process is correlated, the longer is the memory of the subordinated one. As a
consequence, the difference between $\tilde{h}$ (fundamental process unknown) and $h$
(fundamental process known) decreases when the correlation of the fundamental process
increases, and it disappears when this correlation is complete.

Let us consider, as an example, the following simple 
dynamical system 



$$y(t+1) = 2^{\omega(t+1)}\, y(t) \mod 1 \qquad (9)$$



where the $\omega(t)$ are Markovian variables which take the two possible values 0, 1
with equal probability and which persist in their value with probability $p$.

Now let us consider the most trivial partition of the accessible phase space: the two
intervals $[0, 1/2)$, $[1/2, 1)$, identified respectively by the symbols $x = 0$ and
$x = 1$. The symbolic dynamics is very easy to construct: when $\omega(t+1) = 0$ then
$x(t+1) = x(t)$; when $\omega(t+1) = 1$ then $x(t+1) = 0, 1$ with the same probability
1/2.

Then it is straightforward to obtain, independently of $p$, the entropy rate
$h_n = \frac{1}{2}\log 2$ for $n \ge 1$ when the fundamental process is known. This
entropy equals the typical rate of separation of the trajectories.
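
The symbolic rule can be simulated directly, and the entropy rate for a known fundamental process can then be estimated as the rate of the couple process minus the rate of the fundamental one. The sketch below is a minimal illustration under that reading; the function names, the seed handling and the plug-in estimator are assumptions of this example.

```python
import math
import random
from collections import Counter


def simulate(p, steps, seed=0):
    """Symbolic dynamics of the example: omega keeps its value with probability p;
    x is copied when omega = 0 and redrawn uniformly when omega = 1."""
    rng = random.Random(seed)
    omega, x = rng.randint(0, 1), rng.randint(0, 1)
    xs, omegas = [], []
    for _ in range(steps):
        if rng.random() >= p:      # omega flips with probability 1 - p
            omega = 1 - omega
        if omega == 1:             # doubling step: the new symbol is a fair coin
            x = rng.randint(0, 1)
        xs.append(x)
        omegas.append(omega)
    return xs, omegas


def block_entropy(seq, n):
    """Plug-in estimate of the entropy of length-n words from their frequencies."""
    if n == 0:
        return 0.0
    words = Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    total = sum(words.values())
    return -sum(c / total * math.log(c / total) for c in words.values())


def accessible_rate(xs, omegas, n):
    """Estimate the rate of the couple process minus the rate of omega."""
    pairs = list(zip(xs, omegas))
    h_pair = block_entropy(pairs, n + 1) - block_entropy(pairs, n)
    h_omega = block_entropy(omegas, n + 1) - block_entropy(omegas, n)
    return h_pair - h_omega


if __name__ == "__main__":
    xs, omegas = simulate(0.9, 200_000)
    print(accessible_rate(xs, omegas, 2), 0.5 * math.log(2))
```

For long sequences the estimate should stay close to $\frac{1}{2}\log 2$ whatever value of $p$ is passed to `simulate`.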

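On the contrary, the entropy rate $\tilde{h}_n$ for an unknown fundamental process
strongly depends on $p$. If $p = 1/2$ then
$\tilde{h}_n = -\frac{3}{4}\log\frac{3}{4} - \frac{1}{4}\log\frac{1}{4}$ for $n \ge 1$.
This result can be easily interpreted in this way: the averaged probability that $x(t)$
persists in its value is 3/4, while the averaged probability for a change is 1/4.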

In fig. 1, the entropy rate $\tilde{h}_n$ for an unknown fundamental process is plotted
for three different values of $p$ and compared with both $h_n$ (fundamental process known)
and the value of $\tilde{h}_n$ corresponding to $p = 1/2$. When $p \ne 1/2$, one sees from
fig. 1 that $\tilde{h}_n$ is strictly monotone and converges to $\tilde{h}$ in more than a
single step. This behavior implies that, in the absence of information on the fundamental
process, the subordinated process is not Markovian even if the couple process itself is
Markovian. This behavior also implies that, when $n$ increases, the knowledge of the
subordinated trajectory can be used to forecast better.
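
For this example the entropy rate with the fundamental process unknown can be computed exactly for small $n$ by summing over the hidden values of $\omega$. The sketch below does this with a forward recursion and a brute-force sum over all symbol sequences; the function names and the stationary starting distribution (symbol and $\omega$ uniform and independent) are assumptions of this illustration.

```python
import math
from itertools import product


def sequence_probability(xs, p):
    """Probability of the symbol sequence xs when omega is not observed.

    Forward recursion on alpha[w] = P(x_1..x_t, omega(t) = w).
    """
    alpha = [0.25, 0.25]  # stationary start: x_1 and omega(1) uniform, independent
    for prev, cur in zip(xs, xs[1:]):
        new_alpha = [0.0, 0.0]
        for w in (0, 1):
            # omega keeps its value with probability p, flips with 1 - p
            reach = alpha[w] * p + alpha[1 - w] * (1.0 - p)
            # omega = 0 copies the symbol, omega = 1 redraws it uniformly
            emit = (1.0 if cur == prev else 0.0) if w == 0 else 0.5
            new_alpha[w] = reach * emit
        alpha = new_alpha
    return alpha[0] + alpha[1]


def block_entropy(n, p):
    """Exact entropy of n consecutive symbols with omega summed out."""
    if n == 0:
        return 0.0
    total = 0.0
    for xs in product((0, 1), repeat=n):
        q = sequence_probability(xs, p)
        if q > 0.0:
            total -= q * math.log(q)
    return total


def inaccessible_rate(n, p):
    """Entropy rate at order n: difference of consecutive block entropies."""
    return block_entropy(n + 1, p) - block_entropy(n, p)


if __name__ == "__main__":
    for p in (0.5, 0.9, 0.99):
        print(p, [round(inaccessible_rate(n, p), 4) for n in range(1, 9)])
```

The cost grows as $2^n$, so this brute-force version is only meant for small $n$; it is enough to see that the rate is constant in $n$ for $p = 1/2$ and keeps decreasing with $n$ otherwise.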

Furthermore, one observes in fig. 1 that the asymptotic difference $\tilde{h} - h$
between the entropy for an unknown and for a known fundamental process shrinks when $p$
increases and almost vanishes for very large $p$. This is easy to understand since, in
practice, one guesses the value of $x(t+1)$ by looking at the length of the previous
sequence of symbols with the same persistent value. The larger $p$ is, the longer are
these sequences, which carry information.

In other words, since the more of the past can be used, the more information is
recovered, the difference $\tilde{h} - h$ shrinks when $p$ increases.




FIG. 1. The entropy rate $\tilde{h}_n$ (fundamental process unknown) is plotted against
the number of time steps for three different values of $p$: $p = 0.9$ (crosses),
$p = 0.99$ (slanting crosses), $p = 0.999$ (asterisks). For comparison, both $h_n$ (line)
and the value of $\tilde{h}_n$ corresponding to $p = 1/2$ (dots) are reported.

This example shows that the topic of this paper may be quite relevant when one deals with
historical data, i.e. single non-reproducible sequences of symbols, where the joint
probabilities depend on some stochastic parameters. For example, this phenomenology is
typical of finance, where the random parameters reflect economic factors which may be
unknown to a given investor. Investors who trust in fundamental analysis strongly believe
that all information is reflected in the price dynamics. The previous example also shows
the limit of this point of view. In fact, the missing information is only partially
reflected by the data, and only information on very persistent macro-economic factors can
be totally recovered.

Let us summarize the results of this paper as follows:
if the fundamental process is known, the predictability is measured by $h$, which can be
computed in practice as the Shannon entropy of a single long realization of the couple
process minus the Shannon entropy of a single long realization of the fundamental
process. This entropy equals the sum of the positive Lyapunov exponents associated with
the separation of nearby trajectories under the same realization of the noise;

if the fundamental process is unknown, the predictability is measured by $\tilde{h}$,
which can be computed in practice as the Shannon entropy of a single long realization of
the subordinated process. Unfortunately, there is no exact recipe which allows for the
calculation of $\tilde{h}$ by means of Lyapunov exponents, and some more refined
approximate techniques have to be used.